Increasing UMLS Coverage and Reducing Ambiguity via Automated Creation of Synonymous Terms: First Steps toward Filling UMLS Synonymy Gaps
Abstract
Background: Although extensive synonymy is one of the greatest strengths of the UMLS Metathesaurus, much research has nonetheless focused on identifying and measuring gaps in UMLS synonymy. This paper proposes a methodology for further extending the UMLS's already rich synonymy by semi-automatically creating new strings not in the UMLS and including them as additional synonymous strings within existing UMLS concepts. Results: We present our methodology for identifying missing UMLS synonymy and semi-automatically creating synonyms to fill these gaps. We created an enhanced Metathesaurus supplemented by these strings, and improved the performance of two well-known named-entity-recognition applications at the US National Library of Medicine, MetaMap and the Medical Text Indexer (MTI), on both biomedical literature and clinical text. Conclusions: Our methods are first steps toward extending the already rich synonymy of the UMLS by filling in some synonymy gaps. We further theorize that some of the newly created strings could also be used to extend the Medical Subject Headings (MeSH) entry terms, and thereby enhance MEDLINE indexing and PubMed queries by better reflecting how authors actually refer to biomedical concepts in the literature.
Similar resources
Battling Scylla and Charybdis: the search for redundancy and ambiguity in the 2001 UMLS metathesaurus
I previously developed methods for identifying cases of multiple synonymous concepts (redundancy) and concepts with multiple meanings (ambiguity) and applied them to the 1995 UMLS Metathesaurus. These methods use semantic approaches (including knowledge about word synonymy and the semantic types assigned to concepts) to complement the standard lexical approaches. In this paper, I describe the r...
Utilizing the UMLS for Semantic Mapping between Terminologies
An algorithm was derived to find candidate mappings between any two terminologies inside the UMLS, making use of synonymy, explicit mapping relations and hierarchical relationships among UMLS concepts. Using an existing set of mappings from SNOMED CT to ICD9CM as our gold standard, we managed to find candidate mappings for 86% of SNOMED CT terms, with recall of 42% and precision of 20%. Among t...
Sheffield University and the TREC 2004 Genomics Track: Query Expansion Using Synonymous Terms
In this paper we describe our approach to the Ad Hoc Retrieval task of the TREC 2004 Genomics Track. This is a conventional searching task based on a 10-year subset of MEDLINE (about 4.5 million documents and 9 gigabytes in size) and 50 topics derived from information needs obtained via interviews of real biomedical researchers. We will also discuss the results of our submitted runs. The hypoth...
A unified representation of findings in clinical radiology using the UMLS and DICOM
PURPOSE: Collecting and analyzing findings constitute the basis of medical activity. Computer-assisted medical activity raises the problem of modelling findings. We propose a unified representation of findings integrating the representations of findings in the GAMUTS in Radiology [M.M. Reeder, B. Felson, GAMUTS in radiology Comprehensive lists of roentgen differential diagnosis, fourth ed., 2003...
Concepts and Synonymy in the UMLS Metathesaurus
This paper advances a detailed exploration of the complex relationships among terms, concepts, and synonymy in the UMLS Metathesaurus, and proposes the study and understanding of the Metathesaurus from a model-theoretic perspective. Initial sections provide the background and motivation for such an approach, and a careful informal treatment of these notions is offered as a context and basis for...